Challenges, Techniques and Directions in Building XSeek: an XML Search Engine
Authors
Abstract
The importance of supporting keyword searches on XML data has been widely recognized. Unlike structured queries, keyword searches are inherently ambiguous because users are unable or unwilling to specify precise semantics. As a result, processing keyword searches involves many unique challenges. In this paper we discuss the motivation, desiderata and challenges in supporting keyword searches on XML data. We then present an XML keyword search engine, XSeek, which addresses these challenges in several respects: identifying explicit relevant nodes, identifying implicit relevant nodes, and generating result snippets. Finally, we discuss the remaining issues and future research directions.
Related resources
XSeek: A Semantic XML Search Engine Using Keywords
We present XSeek, a keyword search engine that enables users to easily access XML data without needing to learn XPath or XQuery or to study possibly complex data schemas. XSeek addresses a challenge in XML keyword search that has been neglected in the literature: how to determine the desired return information, analogous to inferring a “return” clause in XQuery. To infer the search semanti...
MAXLCA: A New Query Semantic Model for XML Keyword Search
Keyword search enables web users to easily access XML data without understanding the complex data schemas. However, the ambiguity of keyword search makes it arduous to select qualified data nodes matching keywords. To address this challenge in XML datasets whose documents have a relatively low average size, we present a new keyword query semantic model: MAXimal Lowest Common Ancestor (MAXLCA). ...
Guess What I Want: Inferring the Semantics of Keyword Queries Using Evidence Theory
The tagged and nested structure of an XML document provides detailed information about its structure and semantics, which is neglected by traditional keyword search models such as TF-IDF and BM25. Popular XML search models such as SLCA and XRANK tend to return the “deepest” node containing all given keywords, which usually leads to semantic loss. In this paper, we introduce the concept of ...
The Web in Ten Years: Challenges and Opportunities for Database Research
In order to evolve into a dependable and ubiquitous information infrastructure, the World Wide Web needs comprehensive quality, performance, and availability guarantees for all kinds of E-services including search engines. To improve the search result quality of search engines and to exploit the Web’s potential as a world-wide knowledge base, intensive research efforts are required that center ...
Semantic Based XML Context Driven Search And Retrieval System
We present in this paper a context-driven search engine called XCD Search for answering XML keyword-based queries as well as loosely structured queries, using a stack-based sort-merge algorithm. Most current research is focused on building relationships between data elements based solely on their labels and proximity to one another, while overlooking the contexts of the elements, which may lea...
Journal title:
- IEEE Data Eng. Bull.
Volume 32, Issue
Pages -
Publication date 2009